CSE 373: Data Structures and Algorithms

Winter 2000

Programming Assignment 3 Due March 10, 2000

 

REVISED ON FEBRUARY 23, 2000

 

Read this entire assignment sheet before asking questions.

 

Please include your name and student id # on what you turn in.  What you turn in must include a listing of the program, and output data.  As the deadline nears we will provide you with a set of sample input data.  Also email your program listing to c373@ms.washington.edu

 

The goal of this programming assignment is to explore some sorting problems and to appreciate that some problems, even as simple as sorting and searching, do not always have elegant solutions.

 

Lotto

 

Your assignment is fashioned a bit after the state’s Lotto program.  For those unfamiliar with Lotto, it is legalized state sponsored gambling where for $1.00 you can choose 6 numbers from 1 to 55, then twice a week 6 winning numbers at picked at random.  If your entry matches all six numbers, you win.  If they match five out of the six you also win something.  Four out of six wins a little less, and three out of six wins even less.

 

The actual programming assignment

 

Your third programming assignment is to write a program that first takes as input from a file at most 10,000,000 Lotto entries, then the winning combination, and then it is to determine which entries match all six, five, four, or three numbers.  Each entry will be identified with a unique ID number.

 

The input is organized with one entry per line.  Each entry contains 7 numbers.  The first number is the entry ID and is followed by its 6 Lotto numbers.  For example,

 

     10000 3 7 11 13 17 23

     10001 1 2 4 8 16 32

     10002 55 54 53 52 51 50

          :

     65000 37 33 29 53 32 45

     00000 1 2 3 4 5 6

 

The last entry in the file will have an ID# of 00000 and will be the winning combination.

 

Choose an appropriate internal data representation to solve this problem and describe clearly in your documentation/comments how the program works.  Your program must output the following information.

 

1.      The winning combination and the count of the total number of entries.

2.      The number of entries that matched all six numbers and their ID# one per line

3.      The number of entries that matched 5 of the 6 numbers and their ID# one per line.  Also print out all six numbers that were submitted for this entry.

4.      The number of entries that matched 4 of the 6 numbers and their ID# one per line.  Also print out all six numbers that were submitted for this entry.

5.      The number of entries that matched 3 of the 6 numbers and their ID# one per line.  Also print out all six numbers that were submitted for this entry.

6.      Lastly print out the total running time for the program (see the note below on how to do this).

 

Try and have the program run fast but keep the code crisp and clean.  This is an odd sorting problem and we’ll talk about various approaches to this problem in class.

 

I do not believe that there is a “right” solution to this problem.  I really want you to think about the problem and how to solve it.  Conceptually it is an easy problem to understand but coming up with an elegant solution is a lot trickier.

 

How to time your program

 

To add timers to your program to see how fast it really runs is fairly simple.  First we add a call at the start of the program to get the start time and then we add a second call at the end of program to the ending time.  Then we subtract the start time from the ending time and we print out the difference.  To be fair everyone must start the clock at the beginning of the program (before anything is read in or initialized), and stop the timer right before the program is to terminate. The following code fragment demonstrates how to get the start and stop time and print it out.

 

#include <stdio.h>

#include <time.h>

 

void

main (int argc, char *argv[])

{

      clock_t start, stop;

     

      start = clock();

 

      //

      //  Do the work

      //

 

      stop = clock();

 

printf("Start = %d, Stop = %d, Duration = %f seconds\n",

        start, stop, ((double)(stop - start) / CLK_TCK));

      return;

}

 

On most machines the system clock will not report anything smaller than a millisecond so don’t be surprised on small data sets that the program reports running in zero time.

 

Extra Credit

 

For extra credit have your program print out those entries sharing 5 numbers in common.  For example consider the following entry fragment.

 

410034 2 4 7 19 32 47

200170 4 7 32 19 33 2

649909 32 47 7 19 2 33

 

your program will need to print out

 

Entries with 5 numbers in common

 

#410034 #200170 share 2 4 7 19 and 32 in common

#410034 #649909 share 2 7 19 32 and 47 in common

#200170 #649909 share 2 7 19 32 and 33 in common

 

If multiple entries share all 6 numbers in common your extra credit code can treat it as if only 5 are in common and report any of the 5 numbers that match, this should simplify matters.  This will be worth a lot of extra credit if you can do this fast and efficiently.

 

Grading

 

Describe clearly in your documentation/comments how the program works including the algorithms and data structures used.  This is especially true of the extra credit portion.  Your program will be graded on correctness, clarity, neatness, and speed.  If the program is unclear and poorly documented it will receive a very poor grade.  Brute force solutions will also receive a poorer grade besides run slower.